{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 活动热度数据（event_attendees.csv）处理\n",
    "（只取训练集和测试集中出现的用户ID）\n",
    "\n",
    "数据来源于Kaggle竞赛：Event Recommendation Engine Challenge，根据\n",
    "events they’ve responded to in the past\n",
    "user demographic information\n",
    "what events they’ve seen and clicked on in our app\n",
    "用户对某个活动是否感兴趣\n",
    "\n",
    "竞赛官网：\n",
    "https://www.kaggle.com/c/event-recommendation-engine-challenge/data\n",
    "\n",
    "\n",
    "event_attendees.csv文件：共5维特征\n",
    "event_id：活动ID\n",
    "yes, maybe, invited, and no：以空格隔开的用户列表，\n",
    "分别表示该活动参加的用户、可能参加的用户，被邀请的用户和不参加的用户."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 导入工具包"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "import numpy as np\n",
    "import scipy.sparse as ss\n",
    "import scipy.io as sio\n",
    "\n",
    "#保存数据\n",
    "import pickle\n",
    "\n",
    "from sklearn.preprocessing import normalize"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "总的用户数目超过训练集和测试集中的用户，\n",
    "为节省处理时间和内存，先去处理train和test，得到竞赛需要用到的事件和用户\n",
    "然后对在训练集和测试集中出现过的事件和用户建立新的ID索引\n",
    "先运行user_event.ipynb,\n",
    "得到事件列表文件：PE_userIndex.pkl"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 读取之前算好的测试集和训练集中出现过的活动"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "number of events in train & test :13418\n"
     ]
    }
   ],
   "source": [
    "#读取训练集和测试集中出现过的事件列表\n",
    "eventIndex = pickle.load(open(\"PE_eventIndex.pkl\", 'rb'))\n",
    "n_events = len(eventIndex)\n",
    "\n",
    "print(\"number of events in train & test :%d\" % n_events)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{b'1723821298': 0,\n",
       " b'3119717504': 1,\n",
       " b'4176588019': 2,\n",
       " b'3576964854': 3,\n",
       " b'3434452970': 4,\n",
       " b'766559591': 5,\n",
       " b'820281725': 6,\n",
       " b'1238456614': 7,\n",
       " b'1221822553': 8,\n",
       " b'379699270': 9,\n",
       " b'598708806': 10,\n",
       " b'4259189014': 11,\n",
       " b'3135302883': 12,\n",
       " b'2175249529': 13,\n",
       " b'866120598': 14,\n",
       " b'2362894171': 15,\n",
       " b'1188335100': 16,\n",
       " b'2522645669': 17,\n",
       " b'3898095594': 18,\n",
       " b'3282344594': 19,\n",
       " b'3394798379': 20,\n",
       " b'2517697168': 21,\n",
       " b'2508140005': 22,\n",
       " b'1561841388': 23,\n",
       " b'1574620718': 24,\n",
       " b'154699280': 25,\n",
       " b'106517927': 26,\n",
       " b'1173255704': 27,\n",
       " b'3751042480': 28,\n",
       " b'1173584072': 29,\n",
       " b'3586952726': 30,\n",
       " b'3630632197': 31,\n",
       " b'1795987156': 32,\n",
       " b'1671772898': 33,\n",
       " b'651174684': 34,\n",
       " b'4261753049': 35,\n",
       " b'918116457': 36,\n",
       " b'3906315254': 37,\n",
       " b'1202113718': 38,\n",
       " b'2845303452': 39,\n",
       " b'3791426953': 40,\n",
       " b'2508555495': 41,\n",
       " b'3680303526': 42,\n",
       " b'102952604': 43,\n",
       " b'1096641829': 44,\n",
       " b'1955207789': 45,\n",
       " b'3334704715': 46,\n",
       " b'1601859970': 47,\n",
       " b'1013330853': 48,\n",
       " b'734470389': 49,\n",
       " b'1262111436': 50,\n",
       " b'1282190480': 51,\n",
       " b'3604535108': 52,\n",
       " b'183418112': 53,\n",
       " b'1095332232': 54,\n",
       " b'2678620069': 55,\n",
       " b'1665437357': 56,\n",
       " b'712219219': 57,\n",
       " b'16744769': 58,\n",
       " b'321116395': 59,\n",
       " b'1862970065': 60,\n",
       " b'570808303': 61,\n",
       " b'4246557753': 62,\n",
       " b'3097171667': 63,\n",
       " b'8816449': 64,\n",
       " b'3515877913': 65,\n",
       " b'2096635407': 66,\n",
       " b'1101073006': 67,\n",
       " b'2625705319': 68,\n",
       " b'4267881656': 69,\n",
       " b'524730890': 70,\n",
       " b'1978124272': 71,\n",
       " b'1080514315': 72,\n",
       " b'3523548061': 73,\n",
       " b'156790944': 74,\n",
       " b'4011956251': 75,\n",
       " b'1145767321': 76,\n",
       " b'3348456289': 77,\n",
       " b'3593566246': 78,\n",
       " b'85254966': 79,\n",
       " b'583350587': 80,\n",
       " b'1752177173': 81,\n",
       " b'490571154': 82,\n",
       " b'2056684630': 83,\n",
       " b'3911022906': 84,\n",
       " b'2858902069': 85,\n",
       " b'2998407293': 86,\n",
       " b'732598129': 87,\n",
       " b'593742989': 88,\n",
       " b'3441714081': 89,\n",
       " b'3420542143': 90,\n",
       " b'1720699455': 91,\n",
       " b'3073399875': 92,\n",
       " b'2994155492': 93,\n",
       " b'3124721005': 94,\n",
       " b'2512540415': 95,\n",
       " b'3182833252': 96,\n",
       " b'2437756221': 97,\n",
       " b'4259367757': 98,\n",
       " b'4215686475': 99,\n",
       " b'1294481466': 100,\n",
       " b'2468421822': 101,\n",
       " b'2252497386': 102,\n",
       " b'4181789141': 103,\n",
       " b'1324046766': 104,\n",
       " b'3147231818': 105,\n",
       " b'3966643350': 106,\n",
       " b'1913672917': 107,\n",
       " b'2078051662': 108,\n",
       " b'3924788609': 109,\n",
       " b'1711030046': 110,\n",
       " b'738179988': 111,\n",
       " b'2076577222': 112,\n",
       " b'1932611639': 113,\n",
       " b'3227431115': 114,\n",
       " b'1538008083': 115,\n",
       " b'1205391785': 116,\n",
       " b'732997628': 117,\n",
       " b'1488264230': 118,\n",
       " b'810563397': 119,\n",
       " b'3342533717': 120,\n",
       " b'1666413374': 121,\n",
       " b'2873343645': 122,\n",
       " b'2597857228': 123,\n",
       " b'1868620616': 124,\n",
       " b'52639264': 125,\n",
       " b'872496975': 126,\n",
       " b'2275768334': 127,\n",
       " b'913255694': 128,\n",
       " b'2579139880': 129,\n",
       " b'263582981': 130,\n",
       " b'813240354': 131,\n",
       " b'3687555234': 132,\n",
       " b'2286133717': 133,\n",
       " b'2289163846': 134,\n",
       " b'1949386711': 135,\n",
       " b'1797987592': 136,\n",
       " b'1247486859': 137,\n",
       " b'2078652440': 138,\n",
       " b'3229047715': 139,\n",
       " b'2350936091': 140,\n",
       " b'785245384': 141,\n",
       " b'1432188744': 142,\n",
       " b'3406035843': 143,\n",
       " b'290465119': 144,\n",
       " b'2808951120': 145,\n",
       " b'1134365310': 146,\n",
       " b'4005157099': 147,\n",
       " b'553781672': 148,\n",
       " b'1438263468': 149,\n",
       " b'1027412797': 150,\n",
       " b'1202898307': 151,\n",
       " b'4107748684': 152,\n",
       " b'550630680': 153,\n",
       " b'1620415785': 154,\n",
       " b'3731940597': 155,\n",
       " b'2696856891': 156,\n",
       " b'1576945754': 157,\n",
       " b'1902921043': 158,\n",
       " b'1613322504': 159,\n",
       " b'3194641622': 160,\n",
       " b'646048625': 161,\n",
       " b'3858844131': 162,\n",
       " b'2704757206': 163,\n",
       " b'1507640424': 164,\n",
       " b'3816147170': 165,\n",
       " b'1900681884': 166,\n",
       " b'299923852': 167,\n",
       " b'2703345903': 168,\n",
       " b'2258490433': 169,\n",
       " b'2932009302': 170,\n",
       " b'4130722320': 171,\n",
       " b'678132679': 172,\n",
       " b'2164665174': 173,\n",
       " b'3706170359': 174,\n",
       " b'154434302': 175,\n",
       " b'175229279': 176,\n",
       " b'2093694890': 177,\n",
       " b'3282233282': 178,\n",
       " b'4154028951': 179,\n",
       " b'1692430260': 180,\n",
       " b'1730476324': 181,\n",
       " b'2282236511': 182,\n",
       " b'813746591': 183,\n",
       " b'127649684': 184,\n",
       " b'2660812160': 185,\n",
       " b'2006570293': 186,\n",
       " b'3779036237': 187,\n",
       " b'1749744109': 188,\n",
       " b'1042633676': 189,\n",
       " b'970449428': 190,\n",
       " b'3352558065': 191,\n",
       " b'127371840': 192,\n",
       " b'1047438940': 193,\n",
       " b'2497272063': 194,\n",
       " b'2307150378': 195,\n",
       " b'1973598569': 196,\n",
       " b'4193393034': 197,\n",
       " b'2751638880': 198,\n",
       " b'2370228216': 199,\n",
       " b'1598393308': 200,\n",
       " b'2043699274': 201,\n",
       " b'3866708665': 202,\n",
       " b'299197733': 203,\n",
       " b'1985153611': 204,\n",
       " b'3783800124': 205,\n",
       " b'3483003454': 206,\n",
       " b'2259324736': 207,\n",
       " b'2514143386': 208,\n",
       " b'2374455326': 209,\n",
       " b'2259285302': 210,\n",
       " b'4138736243': 211,\n",
       " b'1394487790': 212,\n",
       " b'3141008702': 213,\n",
       " b'1203717384': 214,\n",
       " b'2805166226': 215,\n",
       " b'942844702': 216,\n",
       " b'721698640': 217,\n",
       " b'3192741670': 218,\n",
       " b'424727744': 219,\n",
       " b'2398136361': 220,\n",
       " b'1215962379': 221,\n",
       " b'2431677639': 222,\n",
       " b'2306874322': 223,\n",
       " b'2409266493': 224,\n",
       " b'3875424137': 225,\n",
       " b'1707508219': 226,\n",
       " b'3447361158': 227,\n",
       " b'2681049528': 228,\n",
       " b'1584909205': 229,\n",
       " b'4101475674': 230,\n",
       " b'2174159620': 231,\n",
       " b'1564670110': 232,\n",
       " b'3819599403': 233,\n",
       " b'1500592227': 234,\n",
       " b'3323741493': 235,\n",
       " b'1210091631': 236,\n",
       " b'3253541195': 237,\n",
       " b'4045444993': 238,\n",
       " b'4205511631': 239,\n",
       " b'1766688413': 240,\n",
       " b'510790020': 241,\n",
       " b'486447403': 242,\n",
       " b'1630977665': 243,\n",
       " b'1439164638': 244,\n",
       " b'2463159061': 245,\n",
       " b'2062562875': 246,\n",
       " b'214089778': 247,\n",
       " b'3216437587': 248,\n",
       " b'388060831': 249,\n",
       " b'3382385081': 250,\n",
       " b'2050237599': 251,\n",
       " b'2259674237': 252,\n",
       " b'1909232786': 253,\n",
       " b'3184899378': 254,\n",
       " b'2074041241': 255,\n",
       " b'949540452': 256,\n",
       " b'2725919614': 257,\n",
       " b'2914054397': 258,\n",
       " b'3029325544': 259,\n",
       " b'1495214616': 260,\n",
       " b'1379380863': 261,\n",
       " b'1555528731': 262,\n",
       " b'565075405': 263,\n",
       " b'3444724802': 264,\n",
       " b'414811541': 265,\n",
       " b'3803531290': 266,\n",
       " b'3110219871': 267,\n",
       " b'3192664309': 268,\n",
       " b'19341280': 269,\n",
       " b'1562797526': 270,\n",
       " b'571838530': 271,\n",
       " b'954819298': 272,\n",
       " b'1710224312': 273,\n",
       " b'1786898764': 274,\n",
       " b'4262188500': 275,\n",
       " b'3557171468': 276,\n",
       " b'4206300929': 277,\n",
       " b'4217652951': 278,\n",
       " b'3260945424': 279,\n",
       " b'3418302627': 280,\n",
       " b'144722539': 281,\n",
       " b'3287951899': 282,\n",
       " b'3537990440': 283,\n",
       " b'3565166403': 284,\n",
       " b'1595835882': 285,\n",
       " b'2422863370': 286,\n",
       " b'2151031983': 287,\n",
       " b'3213813029': 288,\n",
       " b'1769821149': 289,\n",
       " b'2933717037': 290,\n",
       " b'3415274331': 291,\n",
       " b'4285356922': 292,\n",
       " b'2108986641': 293,\n",
       " b'626198729': 294,\n",
       " b'2051111588': 295,\n",
       " b'3476135728': 296,\n",
       " b'600715346': 297,\n",
       " b'2465528861': 298,\n",
       " b'4233837355': 299,\n",
       " b'24645215': 300,\n",
       " b'4083904591': 301,\n",
       " b'3374363087': 302,\n",
       " b'1311237184': 303,\n",
       " b'1748949720': 304,\n",
       " b'1019287013': 305,\n",
       " b'1468446505': 306,\n",
       " b'4032794844': 307,\n",
       " b'3855652876': 308,\n",
       " b'1494587743': 309,\n",
       " b'2787953346': 310,\n",
       " b'2479111332': 311,\n",
       " b'3818612808': 312,\n",
       " b'1651296381': 313,\n",
       " b'3979329642': 314,\n",
       " b'3720846471': 315,\n",
       " b'878944665': 316,\n",
       " b'3274802131': 317,\n",
       " b'623252479': 318,\n",
       " b'23065304': 319,\n",
       " b'936471095': 320,\n",
       " b'2619521100': 321,\n",
       " b'1505832167': 322,\n",
       " b'4139045662': 323,\n",
       " b'3498692897': 324,\n",
       " b'2372451104': 325,\n",
       " b'806696202': 326,\n",
       " b'3978621008': 327,\n",
       " b'4276436069': 328,\n",
       " b'1434358511': 329,\n",
       " b'1102866493': 330,\n",
       " b'3541811987': 331,\n",
       " b'4145665238': 332,\n",
       " b'2147100185': 333,\n",
       " b'3154940382': 334,\n",
       " b'3416395266': 335,\n",
       " b'125579007': 336,\n",
       " b'1406192615': 337,\n",
       " b'3694263991': 338,\n",
       " b'3976146754': 339,\n",
       " b'1940922842': 340,\n",
       " b'1896981357': 341,\n",
       " b'4047087562': 342,\n",
       " b'1347399160': 343,\n",
       " b'2144300585': 344,\n",
       " b'4104785185': 345,\n",
       " b'3764071536': 346,\n",
       " b'3903656386': 347,\n",
       " b'3323980704': 348,\n",
       " b'2238892602': 349,\n",
       " b'3850944985': 350,\n",
       " b'2277754128': 351,\n",
       " b'604068853': 352,\n",
       " b'880757724': 353,\n",
       " b'3879238985': 354,\n",
       " b'237764772': 355,\n",
       " b'2124576111': 356,\n",
       " b'126225732': 357,\n",
       " b'3919284079': 358,\n",
       " b'3208240371': 359,\n",
       " b'47566099': 360,\n",
       " b'1213874055': 361,\n",
       " b'3395117104': 362,\n",
       " b'3267660718': 363,\n",
       " b'3376318184': 364,\n",
       " b'2315973008': 365,\n",
       " b'758156246': 366,\n",
       " b'3330051251': 367,\n",
       " b'329123657': 368,\n",
       " b'2301567733': 369,\n",
       " b'2781837253': 370,\n",
       " b'4102873083': 371,\n",
       " b'3834889020': 372,\n",
       " b'215555826': 373,\n",
       " b'2714222740': 374,\n",
       " b'2519922801': 375,\n",
       " b'3376765194': 376,\n",
       " b'2149464820': 377,\n",
       " b'1586611723': 378,\n",
       " b'3101776934': 379,\n",
       " b'1541067257': 380,\n",
       " b'3631810027': 381,\n",
       " b'795158097': 382,\n",
       " b'2540229116': 383,\n",
       " b'3096404071': 384,\n",
       " b'74229557': 385,\n",
       " b'352713895': 386,\n",
       " b'2655011333': 387,\n",
       " b'1520252149': 388,\n",
       " b'521858421': 389,\n",
       " b'3115754876': 390,\n",
       " b'483049188': 391,\n",
       " b'3590775294': 392,\n",
       " b'2255036161': 393,\n",
       " b'3201007806': 394,\n",
       " b'1811322299': 395,\n",
       " b'3287069160': 396,\n",
       " b'738069322': 397,\n",
       " b'1396631048': 398,\n",
       " b'2283367091': 399,\n",
       " b'3413747696': 400,\n",
       " b'303750199': 401,\n",
       " b'1281642334': 402,\n",
       " b'3933535452': 403,\n",
       " b'1640673752': 404,\n",
       " b'1551106149': 405,\n",
       " b'2198898969': 406,\n",
       " b'1232655838': 407,\n",
       " b'3644643572': 408,\n",
       " b'494342684': 409,\n",
       " b'3810234126': 410,\n",
       " b'539541742': 411,\n",
       " b'3063189490': 412,\n",
       " b'389818573': 413,\n",
       " b'601922837': 414,\n",
       " b'3290633708': 415,\n",
       " b'3133239997': 416,\n",
       " b'2025752984': 417,\n",
       " b'944508163': 418,\n",
       " b'3644291653': 419,\n",
       " b'1473560996': 420,\n",
       " b'4202858810': 421,\n",
       " b'2841517077': 422,\n",
       " b'98535486': 423,\n",
       " b'465881910': 424,\n",
       " b'1934335136': 425,\n",
       " b'641950279': 426,\n",
       " b'646715160': 427,\n",
       " b'3720632668': 428,\n",
       " b'2103344824': 429,\n",
       " b'82886770': 430,\n",
       " b'1826801459': 431,\n",
       " b'2962436572': 432,\n",
       " b'3084829721': 433,\n",
       " b'2184211588': 434,\n",
       " b'972005658': 435,\n",
       " b'3145096544': 436,\n",
       " b'3461621159': 437,\n",
       " b'1636407015': 438,\n",
       " b'3910184716': 439,\n",
       " b'1806509997': 440,\n",
       " b'779843977': 441,\n",
       " b'774583783': 442,\n",
       " b'4127802244': 443,\n",
       " b'3023628065': 444,\n",
       " b'3287340168': 445,\n",
       " b'2114493120': 446,\n",
       " b'1965000814': 447,\n",
       " b'786960145': 448,\n",
       " b'2312345356': 449,\n",
       " b'2863056708': 450,\n",
       " b'2269626530': 451,\n",
       " b'628221713': 452,\n",
       " b'3701581814': 453,\n",
       " b'1372188915': 454,\n",
       " b'4125421459': 455,\n",
       " b'2420516619': 456,\n",
       " b'916627849': 457,\n",
       " b'206554895': 458,\n",
       " b'2773966086': 459,\n",
       " b'3206596283': 460,\n",
       " b'2388010027': 461,\n",
       " b'714296247': 462,\n",
       " b'2796594102': 463,\n",
       " b'3025468453': 464,\n",
       " b'41623329': 465,\n",
       " b'650247487': 466,\n",
       " b'1338321138': 467,\n",
       " b'1065083553': 468,\n",
       " b'2356200472': 469,\n",
       " b'1595241082': 470,\n",
       " b'1773363754': 471,\n",
       " b'2168184460': 472,\n",
       " b'557557012': 473,\n",
       " b'1374196106': 474,\n",
       " b'438305330': 475,\n",
       " b'2507674183': 476,\n",
       " b'767550382': 477,\n",
       " b'3416255093': 478,\n",
       " b'2891200198': 479,\n",
       " b'1728898774': 480,\n",
       " b'104311536': 481,\n",
       " b'2724941310': 482,\n",
       " b'850767885': 483,\n",
       " b'3579514613': 484,\n",
       " b'3755202160': 485,\n",
       " b'124610391': 486,\n",
       " b'2165261948': 487,\n",
       " b'3021023405': 488,\n",
       " b'3969260761': 489,\n",
       " b'1295658867': 490,\n",
       " b'1369779823': 491,\n",
       " b'2519019736': 492,\n",
       " b'3314895954': 493,\n",
       " b'3788868841': 494,\n",
       " b'3709226646': 495,\n",
       " b'2191612778': 496,\n",
       " b'1467809542': 497,\n",
       " b'3892832138': 498,\n",
       " b'1085747802': 499,\n",
       " b'3768548263': 500,\n",
       " b'4268934123': 501,\n",
       " b'1124649470': 502,\n",
       " b'4251121485': 503,\n",
       " b'1508951125': 504,\n",
       " b'3202696613': 505,\n",
       " b'881849271': 506,\n",
       " b'869185019': 507,\n",
       " b'2308372833': 508,\n",
       " b'3846957590': 509,\n",
       " b'2978630628': 510,\n",
       " b'2047026922': 511,\n",
       " b'223675975': 512,\n",
       " b'1089632536': 513,\n",
       " b'303459881': 514,\n",
       " b'4121771254': 515,\n",
       " b'2605913428': 516,\n",
       " b'3858621747': 517,\n",
       " b'4112286860': 518,\n",
       " b'3444116390': 519,\n",
       " b'234496454': 520,\n",
       " b'1638217350': 521,\n",
       " b'1672427463': 522,\n",
       " b'3931516549': 523,\n",
       " b'655841147': 524,\n",
       " b'1997149065': 525,\n",
       " b'3731945988': 526,\n",
       " b'1462062430': 527,\n",
       " b'256545806': 528,\n",
       " b'2063177201': 529,\n",
       " b'63434504': 530,\n",
       " b'2138103664': 531,\n",
       " b'3221725438': 532,\n",
       " b'4199168774': 533,\n",
       " b'658933673': 534,\n",
       " b'496222761': 535,\n",
       " b'4251372836': 536,\n",
       " b'2915789051': 537,\n",
       " b'2962515012': 538,\n",
       " b'2587616435': 539,\n",
       " b'1298668930': 540,\n",
       " b'2953099360': 541,\n",
       " b'997439313': 542,\n",
       " b'3583375355': 543,\n",
       " b'1964850967': 544,\n",
       " b'2750873665': 545,\n",
       " b'3357710960': 546,\n",
       " b'2489333364': 547,\n",
       " b'2320431814': 548,\n",
       " b'3898592643': 549,\n",
       " b'2992424940': 550,\n",
       " b'1143513079': 551,\n",
       " b'2691609942': 552,\n",
       " b'2211704221': 553,\n",
       " b'1395819594': 554,\n",
       " b'388117170': 555,\n",
       " b'1747799199': 556,\n",
       " b'1801234307': 557,\n",
       " b'3221268576': 558,\n",
       " b'874736953': 559,\n",
       " b'389805714': 560,\n",
       " b'4181313745': 561,\n",
       " b'4248391618': 562,\n",
       " b'1583917689': 563,\n",
       " b'2691131073': 564,\n",
       " b'3107909715': 565,\n",
       " b'812528842': 566,\n",
       " b'2963144429': 567,\n",
       " b'859015855': 568,\n",
       " b'1576119037': 569,\n",
       " b'439311946': 570,\n",
       " b'3206828033': 571,\n",
       " b'1260659849': 572,\n",
       " b'4184715643': 573,\n",
       " b'1097776174': 574,\n",
       " b'3545475739': 575,\n",
       " b'3434354867': 576,\n",
       " b'2505330132': 577,\n",
       " b'473531767': 578,\n",
       " b'816384882': 579,\n",
       " b'1370244570': 580,\n",
       " b'4250931903': 581,\n",
       " b'1738737666': 582,\n",
       " b'95873940': 583,\n",
       " b'883329724': 584,\n",
       " b'3979413296': 585,\n",
       " b'1082833698': 586,\n",
       " b'2105086267': 587,\n",
       " b'3367174753': 588,\n",
       " b'2511625066': 589,\n",
       " b'1215834685': 590,\n",
       " b'3671450770': 591,\n",
       " b'25712446': 592,\n",
       " b'2057354451': 593,\n",
       " b'2353399295': 594,\n",
       " b'2106690920': 595,\n",
       " b'151676688': 596,\n",
       " b'3842362115': 597,\n",
       " b'3786887622': 598,\n",
       " b'3439001309': 599,\n",
       " b'296367110': 600,\n",
       " b'2662111509': 601,\n",
       " b'3089753442': 602,\n",
       " b'1841881977': 603,\n",
       " b'2708587571': 604,\n",
       " b'784441822': 605,\n",
       " b'2730018676': 606,\n",
       " b'2689458299': 607,\n",
       " b'3544663643': 608,\n",
       " b'896838988': 609,\n",
       " b'2275514332': 610,\n",
       " b'2103350077': 611,\n",
       " b'1744785223': 612,\n",
       " b'833858713': 613,\n",
       " b'3436774163': 614,\n",
       " b'552669137': 615,\n",
       " b'1766166660': 616,\n",
       " b'3848300978': 617,\n",
       " b'1579687184': 618,\n",
       " b'2280892735': 619,\n",
       " b'2321153776': 620,\n",
       " b'633659090': 621,\n",
       " b'3622325352': 622,\n",
       " b'3857367648': 623,\n",
       " b'1554701042': 624,\n",
       " b'679023125': 625,\n",
       " b'655514048': 626,\n",
       " b'1773358985': 627,\n",
       " b'116335087': 628,\n",
       " b'600398738': 629,\n",
       " b'1996087141': 630,\n",
       " b'1117966805': 631,\n",
       " b'934841988': 632,\n",
       " b'748520427': 633,\n",
       " b'3922768537': 634,\n",
       " b'3355352784': 635,\n",
       " b'1085259815': 636,\n",
       " b'2033068756': 637,\n",
       " b'417621163': 638,\n",
       " b'1494513322': 639,\n",
       " b'3802870894': 640,\n",
       " b'4202270768': 641,\n",
       " b'3853166763': 642,\n",
       " b'2170142135': 643,\n",
       " b'4284647357': 644,\n",
       " b'1951437492': 645,\n",
       " b'1603457279': 646,\n",
       " b'3078061160': 647,\n",
       " b'2250303690': 648,\n",
       " b'1939550439': 649,\n",
       " b'2608603914': 650,\n",
       " b'3365249126': 651,\n",
       " b'2579869852': 652,\n",
       " b'3374955799': 653,\n",
       " b'3633515900': 654,\n",
       " b'1520335024': 655,\n",
       " b'2763790767': 656,\n",
       " b'3520307709': 657,\n",
       " b'818179591': 658,\n",
       " b'868771423': 659,\n",
       " b'3763851020': 660,\n",
       " b'1701227599': 661,\n",
       " b'136622699': 662,\n",
       " b'2073339412': 663,\n",
       " b'1870423168': 664,\n",
       " b'2354590858': 665,\n",
       " b'3707599484': 666,\n",
       " b'426403871': 667,\n",
       " b'951441070': 668,\n",
       " b'1180987071': 669,\n",
       " b'2402098473': 670,\n",
       " b'3864843980': 671,\n",
       " b'2405148727': 672,\n",
       " b'861118590': 673,\n",
       " b'964696705': 674,\n",
       " b'1525541504': 675,\n",
       " b'848304112': 676,\n",
       " b'890019369': 677,\n",
       " b'3382116221': 678,\n",
       " b'450130761': 679,\n",
       " b'3273695084': 680,\n",
       " b'1967012522': 681,\n",
       " b'885074740': 682,\n",
       " b'2619961798': 683,\n",
       " b'229928069': 684,\n",
       " b'1137018984': 685,\n",
       " b'1151885141': 686,\n",
       " b'3315486330': 687,\n",
       " b'963733660': 688,\n",
       " b'53942731': 689,\n",
       " b'2844145532': 690,\n",
       " b'736179869': 691,\n",
       " b'1557270795': 692,\n",
       " b'1093614806': 693,\n",
       " b'1333829432': 694,\n",
       " b'3217192264': 695,\n",
       " b'2584839423': 696,\n",
       " b'1757355647': 697,\n",
       " b'2900719315': 698,\n",
       " b'3626702003': 699,\n",
       " b'1884840556': 700,\n",
       " b'1533293110': 701,\n",
       " b'3907601761': 702,\n",
       " b'1395068547': 703,\n",
       " b'1155431373': 704,\n",
       " b'2498764027': 705,\n",
       " b'561980641': 706,\n",
       " b'1369337496': 707,\n",
       " b'2693083714': 708,\n",
       " b'1498254766': 709,\n",
       " b'3946614613': 710,\n",
       " b'521837082': 711,\n",
       " b'1733435510': 712,\n",
       " b'1729385316': 713,\n",
       " b'3171602353': 714,\n",
       " b'1687011434': 715,\n",
       " b'1310511988': 716,\n",
       " b'1069076071': 717,\n",
       " b'1558108450': 718,\n",
       " b'4202477631': 719,\n",
       " b'421490602': 720,\n",
       " b'84598651': 721,\n",
       " b'831772458': 722,\n",
       " b'3729155908': 723,\n",
       " b'1325042066': 724,\n",
       " b'1823496423': 725,\n",
       " b'4208530995': 726,\n",
       " b'3102156677': 727,\n",
       " b'4084837460': 728,\n",
       " b'3274156698': 729,\n",
       " b'2492936252': 730,\n",
       " b'3488082754': 731,\n",
       " b'3411275264': 732,\n",
       " b'4011667261': 733,\n",
       " b'1417903688': 734,\n",
       " b'3971808383': 735,\n",
       " b'3658267924': 736,\n",
       " b'4102767305': 737,\n",
       " b'4232273219': 738,\n",
       " b'4051353547': 739,\n",
       " b'1804108516': 740,\n",
       " b'18401271': 741,\n",
       " b'3158451708': 742,\n",
       " b'2264041858': 743,\n",
       " b'724466045': 744,\n",
       " b'2186245955': 745,\n",
       " b'921689165': 746,\n",
       " b'2235428011': 747,\n",
       " b'1363299595': 748,\n",
       " b'158767969': 749,\n",
       " b'2106861684': 750,\n",
       " b'44040604': 751,\n",
       " b'1600524252': 752,\n",
       " b'3388135408': 753,\n",
       " b'3425994462': 754,\n",
       " b'1275755916': 755,\n",
       " b'2667740825': 756,\n",
       " b'438311948': 757,\n",
       " b'1537631261': 758,\n",
       " b'772318802': 759,\n",
       " b'1306613841': 760,\n",
       " b'3812153136': 761,\n",
       " b'1207717800': 762,\n",
       " b'2633697384': 763,\n",
       " b'1887678907': 764,\n",
       " b'2829792984': 765,\n",
       " b'3406437144': 766,\n",
       " b'2490299454': 767,\n",
       " b'309860528': 768,\n",
       " b'510454016': 769,\n",
       " b'1699259166': 770,\n",
       " b'2982847299': 771,\n",
       " b'3878670654': 772,\n",
       " b'3251235057': 773,\n",
       " b'186124210': 774,\n",
       " b'1116367368': 775,\n",
       " b'1655226972': 776,\n",
       " b'2514842060': 777,\n",
       " b'3642199129': 778,\n",
       " b'3463879879': 779,\n",
       " b'2402784975': 780,\n",
       " b'3098436124': 781,\n",
       " b'750863544': 782,\n",
       " b'3005193781': 783,\n",
       " b'3865317184': 784,\n",
       " b'3954432137': 785,\n",
       " b'4116595591': 786,\n",
       " b'935931560': 787,\n",
       " b'759623017': 788,\n",
       " b'110927946': 789,\n",
       " b'3335311287': 790,\n",
       " b'1787065236': 791,\n",
       " b'2438434714': 792,\n",
       " b'3830485621': 793,\n",
       " b'230148747': 794,\n",
       " b'3553843473': 795,\n",
       " b'2612400653': 796,\n",
       " b'1855699894': 797,\n",
       " b'3276095034': 798,\n",
       " b'2832529995': 799,\n",
       " b'3031347403': 800,\n",
       " b'1612428848': 801,\n",
       " b'2915662989': 802,\n",
       " b'3072473192': 803,\n",
       " b'4177226936': 804,\n",
       " b'2536605290': 805,\n",
       " b'1665697394': 806,\n",
       " b'2271049585': 807,\n",
       " b'2354259889': 808,\n",
       " b'710272215': 809,\n",
       " b'2083799098': 810,\n",
       " b'2291564875': 811,\n",
       " b'2400649681': 812,\n",
       " b'473110551': 813,\n",
       " b'2110476189': 814,\n",
       " b'1945950450': 815,\n",
       " b'1201353948': 816,\n",
       " b'941395439': 817,\n",
       " b'1970721704': 818,\n",
       " b'1784326680': 819,\n",
       " b'2187392019': 820,\n",
       " b'2020531758': 821,\n",
       " b'2533071551': 822,\n",
       " b'1983407490': 823,\n",
       " b'626269042': 824,\n",
       " b'3273774165': 825,\n",
       " b'1120641064': 826,\n",
       " b'734755850': 827,\n",
       " b'3677480269': 828,\n",
       " b'1329440639': 829,\n",
       " b'799356822': 830,\n",
       " b'2424116559': 831,\n",
       " b'2592239355': 832,\n",
       " b'392004313': 833,\n",
       " b'3859818831': 834,\n",
       " b'3534091101': 835,\n",
       " b'2301723344': 836,\n",
       " b'288749001': 837,\n",
       " b'3420605744': 838,\n",
       " b'3820740781': 839,\n",
       " b'2081433153': 840,\n",
       " b'2319080380': 841,\n",
       " b'1933542442': 842,\n",
       " b'1469291456': 843,\n",
       " b'2835333152': 844,\n",
       " b'1088257690': 845,\n",
       " b'432884667': 846,\n",
       " b'3796480562': 847,\n",
       " b'693857041': 848,\n",
       " b'3572215281': 849,\n",
       " b'2951450859': 850,\n",
       " b'2681949096': 851,\n",
       " b'698908517': 852,\n",
       " b'2852836897': 853,\n",
       " b'2166881970': 854,\n",
       " b'546889678': 855,\n",
       " b'3135799710': 856,\n",
       " b'3978835544': 857,\n",
       " b'3047117217': 858,\n",
       " b'4234558652': 859,\n",
       " b'3969026856': 860,\n",
       " b'1410614402': 861,\n",
       " b'3861842266': 862,\n",
       " b'3259367353': 863,\n",
       " b'2101810388': 864,\n",
       " b'2627346430': 865,\n",
       " b'2984524106': 866,\n",
       " b'2732029489': 867,\n",
       " b'3252432592': 868,\n",
       " b'1373597420': 869,\n",
       " b'2731275763': 870,\n",
       " b'4089322861': 871,\n",
       " b'1633829458': 872,\n",
       " b'3044687724': 873,\n",
       " b'3065202026': 874,\n",
       " b'4070309332': 875,\n",
       " b'3276897540': 876,\n",
       " b'3307369997': 877,\n",
       " b'3864376594': 878,\n",
       " b'41252927': 879,\n",
       " b'2876154686': 880,\n",
       " b'2402634095': 881,\n",
       " b'3399197241': 882,\n",
       " b'426166781': 883,\n",
       " b'3460711091': 884,\n",
       " b'1396446223': 885,\n",
       " b'1635377146': 886,\n",
       " b'2099198850': 887,\n",
       " b'1936956555': 888,\n",
       " b'577524735': 889,\n",
       " b'2643430993': 890,\n",
       " b'235695034': 891,\n",
       " b'445012721': 892,\n",
       " b'2724372163': 893,\n",
       " b'570504601': 894,\n",
       " b'4065675893': 895,\n",
       " b'3795440624': 896,\n",
       " b'1881331411': 897,\n",
       " b'1573329057': 898,\n",
       " b'4095724028': 899,\n",
       " b'122777568': 900,\n",
       " b'4074887767': 901,\n",
       " b'1824334392': 902,\n",
       " b'797444832': 903,\n",
       " b'2025136511': 904,\n",
       " b'1824709964': 905,\n",
       " b'1589462668': 906,\n",
       " b'2762265549': 907,\n",
       " b'2547977028': 908,\n",
       " b'491161389': 909,\n",
       " b'3181549607': 910,\n",
       " b'352094176': 911,\n",
       " b'1894439828': 912,\n",
       " b'3218575152': 913,\n",
       " b'1861716415': 914,\n",
       " b'592194601': 915,\n",
       " b'3806905204': 916,\n",
       " b'1418550524': 917,\n",
       " b'3968521389': 918,\n",
       " b'3406775761': 919,\n",
       " b'3433938884': 920,\n",
       " b'1040771467': 921,\n",
       " b'3244133531': 922,\n",
       " b'1043752342': 923,\n",
       " b'3075707957': 924,\n",
       " b'3094078712': 925,\n",
       " b'1145166049': 926,\n",
       " b'3129504825': 927,\n",
       " b'4130278250': 928,\n",
       " b'251609489': 929,\n",
       " b'3382118918': 930,\n",
       " b'2180806657': 931,\n",
       " b'3311537362': 932,\n",
       " b'3837372536': 933,\n",
       " b'1409742519': 934,\n",
       " b'4042920370': 935,\n",
       " b'2115636995': 936,\n",
       " b'3569923325': 937,\n",
       " b'111720609': 938,\n",
       " b'1384663745': 939,\n",
       " b'1397225485': 940,\n",
       " b'832111597': 941,\n",
       " b'3198700718': 942,\n",
       " b'2209057061': 943,\n",
       " b'1185540477': 944,\n",
       " b'3572977792': 945,\n",
       " b'2367415614': 946,\n",
       " b'2793834557': 947,\n",
       " b'3874905711': 948,\n",
       " b'2462397213': 949,\n",
       " b'376484644': 950,\n",
       " b'1522597700': 951,\n",
       " b'3922547783': 952,\n",
       " b'3512407626': 953,\n",
       " b'2710165417': 954,\n",
       " b'2844548954': 955,\n",
       " b'3699376073': 956,\n",
       " b'4058794884': 957,\n",
       " b'1742102536': 958,\n",
       " b'1023462832': 959,\n",
       " b'1002911800': 960,\n",
       " b'1332771610': 961,\n",
       " b'1977805725': 962,\n",
       " b'364721213': 963,\n",
       " b'2492206995': 964,\n",
       " b'2580858981': 965,\n",
       " b'4072954907': 966,\n",
       " b'1607345926': 967,\n",
       " b'3930077968': 968,\n",
       " b'3051234166': 969,\n",
       " b'3211583806': 970,\n",
       " b'3701465217': 971,\n",
       " b'406892307': 972,\n",
       " b'3722242006': 973,\n",
       " b'1379132639': 974,\n",
       " b'2775969504': 975,\n",
       " b'2107176778': 976,\n",
       " b'3383063101': 977,\n",
       " b'1167949960': 978,\n",
       " b'699952923': 979,\n",
       " b'4220107080': 980,\n",
       " b'943072239': 981,\n",
       " b'2505798415': 982,\n",
       " b'1873976153': 983,\n",
       " b'649028999': 984,\n",
       " b'2872096089': 985,\n",
       " b'3936430504': 986,\n",
       " b'4068367333': 987,\n",
       " b'3434302633': 988,\n",
       " b'4241216275': 989,\n",
       " b'116153421': 990,\n",
       " b'2311572043': 991,\n",
       " b'3155836085': 992,\n",
       " b'878192149': 993,\n",
       " b'4021003478': 994,\n",
       " b'3798356241': 995,\n",
       " b'1383584914': 996,\n",
       " b'688975255': 997,\n",
       " b'2769292935': 998,\n",
       " b'4014308675': 999,\n",
       " ...}"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "eventIndex"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# event_attendees.csv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Read event_attendees.csv and compute a popularity score per event:\n",
    "# (number of users who said yes) - (number of users who said no).\n",
    "\n",
    "# Event popularity, one row per event that appears in train/test\n",
    "eventPopularity = ss.dok_matrix((n_events, 1))\n",
    "\n",
    "# Columns: event_id, yes, maybe, invited, no\n",
    "with open(\"event_attendees.csv\", 'rb') as f:\n",
    "    f.readline()  # skip header\n",
    "\n",
    "    for line in f:\n",
    "        cols = line.strip().split(b\",\")\n",
    "        eventId = cols[0]  # event_id\n",
    "        if eventId in eventIndex:\n",
    "            i = eventIndex[eventId]  # event index\n",
    "\n",
    "            # yes - no; split() with no argument returns [] for an empty\n",
    "            # field, so an empty user list counts as 0 rather than 1\n",
    "            eventPopularity[i, 0] = len(cols[1].split()) - len(cols[4].split())\n",
    "\n",
    "# L1-normalize the column so that the absolute values sum to 1\n",
    "eventPopularity = normalize(eventPopularity, norm=\"l1\",\n",
    "      axis=0, copy=False)\n",
    "sio.mmwrite(\"EA_eventPopularity\", eventPopularity)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "matrix([[1.03523241e-04],\n",
       "        [2.99067141e-05],\n",
       "        [1.61036153e-05],\n",
       "        ...,\n",
       "        [3.68082635e-05],\n",
       "        [9.20206586e-06],\n",
       "        [6.90154940e-06]])"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "eventPopularity.todense()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
